Abstract

We propose a new inequality that we call the conditional ageing inequality (CAIN). 
The CAIN is a slight generalization to non-equilibrium situations of the Second Law 
of thermodynamics. The goal of this paper is to study the consequences of the CAIN. 
We use the CAIN to discuss Maxwell demon processes (i.e., thermodynamic processes
with feedback). In particular, we apply the CAIN to four cases of the Szilard engine:
for a classical or a quantum system with either one or two correlated particles. Besides 
proposing this new inequality that we call the CAIN, another novel feature of this 
paper is that we use quantum Bayesian networks for our analysis of Maxwell demon 
processes. 
1 Introduction



In Ref.[1], Maxwell proposed his famous gedanken experiment wherein a demon controls
the flow of gas particles from one chamber to another and decides which particles
to let through based on their temperature. He gave this thought experiment as an 
example of a thermodynamic process in which the Second Law of thermodynamics 
appears to be violated. He dismissed the paradox by saying that the Second Law 
is true only on average. In Ref. [2], Szilard proposed an engine which is a simplified 
version of a Maxwell's demon. Szilard argued that for his engine, the Second Law is 
not violated at all, as long as the work performed by the demon to make his measurements
is taken into account. In Refs.[3] and [4], Landauer and later Bennett pointed
out that measurements can be performed without spending any energy, but that in 
order for an engine to perform a cyclic process, it needs to store the information of 
the measurement on a tape and then erase and re-initialize that tape once per cycle. 
These tape operations will always consume an amount of energy larger than or equal to
the work the demon can extract from changes in gas volumes. 

Maxwell's demon thought experiment might have once been considered para- 
doxical, but after the work of Szilard, Landauer and Bennett, most scientists consider 
the paradox pretty much solved. Nevertheless, some people, myself included, still 
strive to make the mathematics involved in the treatment of Maxwell's demon a bit 
more streamlined. That is one of the goals of this paper, to look at Maxwell's demon 
from a different point of view, hoping that this might yield new insights to an already 
understood problem. 

This paper originated as an attempt to understand a series of papers (Refs. [5] 
to [11]) by Sagawa, Ueda and coworkers (S-U) in which they claim that the standard 
Second Law of thermodynamics does not apply to non-equilibrium processes with 
feedback (i.e., Maxwell demon type processes). They give a generalization of the
Second Law that they claim does apply to such processes. Although I agree in spirit 
with much of what S-U are trying to do, and I profited immensely from reading their 
papers, I disagree with some of the details of their theory. I discuss my disagreements 
with the S-U theory in a separate paper, Ref. [12]. The goal of this paper is to report 
on my own theory for generalizing the Second Law so that it applies to processes with
feedback. My theory agrees in spirit with the S-U theory, but differs from it in some 
important details. 

Let me explain the rationale behind my theory. 

Suppose we want to consider a system in thermal contact but not necessarily 
in equilibrium with a bath at temperature T. Let X denote all non-thermal variables 
(fast changing, not in thermal equilibrium) and let $\theta$ denote all thermal variables
(slow changing, in thermal equilibrium) describing both the system and bath. Let $\tau$
denote time. For any operator $\Omega_\tau$, define $\Omega_\tau|_{\tau=\tau_1}^{\tau_2} = \Omega_{\tau_2} - \Omega_{\tau_1}$. My slight generalization
of the Second Law is

    $S_\tau(\theta_\tau | X_\tau)|_{\tau=0}^{\tau} \geq 0$ ,    (1)



where $S(a|b)$ is the conditional entropy (i.e., conditional spread) of $a$ given $b$.
I call Eq.(1) the conditional ageing inequality (CAIN). The standard Second Law
corresponds to the special case when there are no $X$ variables, in which case Eq.(1)
reduces to

    $S_\tau(\theta_\tau)|_{\tau=0}^{\tau} \geq 0$ .    (2)

The standard Second Law could be described as unconditional ageing, or simply as
ageing.

Now, what is the justification for the CAIN? The justification for the Second
Law Eq.(2) is that the superoperator that evolves the overall probability distribution
in the classical case (or the overall density matrix in the quantum case), from time
0 to $\tau$, increases entropy because it can be shown to be doubly stochastic in the classical
case (or unital in the quantum case; a superoperator is unital if it maps the identity
matrix to itself). The justification for the CAIN is the same,
except that the evolution superoperator is doubly stochastic (or unital) only if the
non-thermal variables are held fixed during the evolution. The CAIN is not true for
all evolution superoperators. Our hope is that it applies to systems of interest that
commonly occur in nature.

The goal of this paper is to study the consequences of the CAIN. In particular,
we apply the CAIN to four cases of the Szilard engine: for a classical or a quantum
system with either one or two correlated particles.

Besides proposing this new inequality that we call the CAIN, another novel
feature of this paper is that we use quantum Bayesian networks for our analysis of
Maxwell demon type processes.

This paper is written assuming that the reader has first read Refs.[13] and [14].
Ref.[13] is an introduction to quantum Bayesian networks for mixed states. Ref.[14]
discusses well-known inequalities of classical and quantum SIT (Shannon Information
Theory) from a Bayesian networks perspective.

In this paper, we will use the abbreviation $\frac{N(x)}{\sum_x \mathrm{num}} = \frac{N(x)}{\sum_x N(x)}$, where "num" stands for
the numerator. We will also use the abbreviations $\Gamma_{a:b} = (\Gamma_a, \Gamma_{a+1}, \ldots, \Gamma_b)$ and
$\Gamma_{<b} = \Gamma_{0:b-1}$ for any vector $\Gamma$ and any integers $a, b$ such that $0 \leq a \leq b$.

2 Review of Some Properties of Thermal States

In this section, we will review some well known properties of thermal states that we
shall use later on in the paper to study some consequences of the CAIN. Most of the
contents of this section can be found in reviews about entropy such as Ref.[15] by
Wehrl and textbooks on Statistical Mechanics such as Ref.[16] by Feynman.

Suppose $j$ is a classical random variable that can take on values $j \in S_j$ and
has a probability distribution $P_j(j)$. We will denote the average of any function
$f : S_j \to \mathbb{R}$ by

    $\langle f(j) \rangle_j = \langle f(j) \rangle_{P_j} = \sum_j P_j(j) f(j)$ .    (3)

When speaking about quantum physics, if $\rho$ is a density operator acting on a
Hilbert space $\mathcal{H}$, and $\Omega$ is a Hermitian operator also acting on $\mathcal{H}$, we will denote the
average of $\Omega$ by

    $\langle \Omega \rangle_\rho = \mathrm{tr}(\rho\,\Omega)$ .    (4)

For example, in this notation the von Neumann entropy of $\rho$ is

    $S(\rho) = -\langle \ln \rho \rangle_\rho$ .    (5)

Consider a system with density matrix $\rho$ and Hamiltonian $\mathcal{E}$. Suppose the
eigenvalue decomposition of $\mathcal{E}$ is $\mathcal{E} = \sum_j E_j |E_j\rangle\langle E_j|$. The internal energy of the
system is defined as

    $U = E = \langle \mathcal{E} \rangle_\rho = \sum_j E_j \langle E_j|\rho|E_j\rangle$ .    (6)

2.1 Simple Properties of Thermal States

Thermal states (a.k.a. canonical ensembles or Gibbs states) are states with a definite
temperature $T$. Their form is given below.

In this paper, we will use what are called natural Planck units. As in Eq.(5),
our entropies will be defined in terms of natural logs (instead of base 2 logs) and
without the $k_B$ ($k_B$ is Boltzmann's constant). Temperatures will be given in energy
units and entropies in nats. If $T$ is the temperature in energy units and $T^{(K)}$ is the
temperature in degrees Kelvin, then $T = k_B T^{(K)}$. We will also use $\beta = 1/T$.

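Consider a system with Hamiltonian $\mathcal{E} = \sum_j E_j |E_j\rangle\langle E_j|$ which has reached
thermal equilibrium at a temperature $T$. The partition function of the system is
defined by

    $Z^\beta(\mathcal{E}) = \mathrm{tr}(e^{-\beta\mathcal{E}}) = \sum_j e^{-\beta E_j}$ .    (7)

Its density matrix is

    $\rho^\beta(\mathcal{E}) = \frac{e^{-\beta\mathcal{E}}}{Z^\beta(\mathcal{E})}$ .    (8)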
Its entropy is

    $S^\beta(\mathcal{E}) = S(\rho^\beta(\mathcal{E}))$ .    (9)

Its free energy is

    $F^\beta(\mathcal{E}) = -T \ln Z^\beta(\mathcal{E})$ .    (10)

Its pressure $P$ (not to be confused with probability $P(j)$) is

Later we will show that this expression for pressure gives the expected $dE = -P\,dV$.
(Thus, the internal energy of the system decreases if the system does work by increasing its
volume by $dV$.)



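Claim 1 Let $S = S^\beta(\mathcal{E})$, $E = \langle\mathcal{E}\rangle_{\rho^\beta(\mathcal{E})}$, and $F = F^\beta(\mathcal{E})$. Then

    $E = TS + F$ .    (12a)

(Thus, the internal energy is the sum of a bound part ($T$ times the entropy) and a free part
(the free energy).) Furthermore

    $\left(\frac{\partial F}{\partial V}\right)_T = -P , \quad \left(\frac{\partial F}{\partial T}\right)_V = -S$ .    (12b)

(Thus, the free energy decreases if the volume or the temperature increases.) Furthermore

    $dF = -S\,dT - P\,dV$ ,    (12c)

and

    $dE = T\,dS - P\,dV$ .    (12d)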

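proof:

To prove Eq.(12a), note that

    $S = \left\langle \ln \frac{Z}{e^{-\beta E_j}} \right\rangle = \ln Z + \beta \langle E_j \rangle = -\frac{F}{T} + \frac{E}{T}$ .    (13)

To prove the second relation in Eq.(12b), note that

    $\left(\frac{\partial F}{\partial T}\right)_V = -\ln Z - \frac{T}{Z}\left(\frac{\partial Z}{\partial T}\right)_V$    (15a)
    $= -\ln Z - \beta \langle E_j \rangle$    (15b)
    $= -S$ .    (15c)

To prove Eq.(12c), note that

    $dF = \left(\frac{\partial F}{\partial V}\right)_T dV + \left(\frac{\partial F}{\partial T}\right)_V dT$ .    (16)

To prove Eq.(12d), just use Eqs.(12a) and (12c).

QED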
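The identities of Claim 1 can be checked numerically. The following is only a minimal sketch for a classical Gibbs distribution; the four-level spectrum, the temperature and the function name `thermal_state` are illustrative assumptions, not taken from the paper. It verifies Eq.(12a) directly and the relation $(\partial F/\partial T)_V = -S$ by a central finite difference.

```python
import math

def thermal_state(energies, T):
    """Return (Z, probs, E, S, F) for a Gibbs distribution at temperature T (k_B = 1)."""
    beta = 1.0 / T
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)
    probs = [w / Z for w in weights]
    E = sum(p * e for p, e in zip(probs, energies))        # internal energy
    S = -sum(p * math.log(p) for p in probs if p > 0.0)    # entropy in nats
    F = -T * math.log(Z)                                   # free energy, Eq.(10)
    return Z, probs, E, S, F

energies = [0.0, 0.7, 1.5, 2.2]   # hypothetical spectrum, arbitrary energy units
T = 0.9                           # hypothetical temperature in energy units
Z, probs, E, S, F = thermal_state(energies, T)

# Eq.(12a): E = T S + F
gap_12a = abs(E - (T * S + F))

# Eq.(12b), temperature part: (dF/dT)_V = -S, via a central finite difference
step = 1e-5
F_plus = thermal_state(energies, T + step)[4]
F_minus = thermal_state(energies, T - step)[4]
gap_12b = abs((F_plus - F_minus) / (2 * step) + S)

print(gap_12a, gap_12b)
```

Both gaps should be zero up to rounding and finite-difference error. The volume derivative in Eq.(12b) is not checked because it would need a model for how the levels depend on $V$.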



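Claim 2 $S^\beta(\mathcal{E})$ and $\langle\mathcal{E}\rangle_{\rho^\beta(\mathcal{E})}$ are monotonically increasing and $F^\beta(\mathcal{E})$ is a
monotonically decreasing function of temperature. In fact,

    $-\frac{dS(\rho)}{\beta\,d\beta} = -\frac{d\langle\mathcal{E}\rangle_\rho}{d\beta} = \langle (\mathcal{E} - \langle\mathcal{E}\rangle_\rho)^2 \rangle_\rho \geq 0$    (17)

and

    $\frac{dF^\beta(\mathcal{E})}{d\beta} = \frac{S(\rho)}{\beta^2}$ ,    (18)

where we are abbreviating $\rho^\beta(\mathcal{E})$ by just $\rho$.

proof: Just straightforward calculus.
QED

Claim 3 Let $S = S^\beta(\mathcal{E})$, $E = \langle\mathcal{E}\rangle_{\rho^\beta(\mathcal{E})}$, and $F = F^\beta(\mathcal{E})$. Then

              $T \to 0$      $T \to \infty$
    $S =$     $0$            $\ln N$
    $E =$     $E_0$          $\frac{1}{N}\sum_j E_j$
    $F =$     $E_0$          $-T \ln N$          (19)

where $\{E_j\}_{j=0}^{N-1}$ are the eigenvalues of $\mathcal{E}$ and $E_0$ is the lowest one.

proof: Obvious.
QED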
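Claims 2 and 3 can be illustrated with a short numerical scan over temperature. This is a sketch under assumptions: the non-degenerate spectrum and the temperature grid are made up, and the function name `gibbs` is hypothetical. It checks the monotonicity statements on the grid and compares very low and very high temperatures with the limits in Eq.(19).

```python
import math

def gibbs(energies, T):
    """Return (E, S, F) of the Gibbs distribution at temperature T, with k_B = 1."""
    e0 = min(energies)
    # Shift by the lowest energy so the exponentials stay finite at low T.
    w = [math.exp(-(e - e0) / T) for e in energies]
    z = sum(w)
    p = [x / z for x in w]
    E = sum(pi * e for pi, e in zip(p, energies))
    S = -sum(pi * math.log(pi) for pi in p if pi > 0.0)
    F = e0 - T * math.log(z)          # equals -T ln Z for the unshifted Z
    return E, S, F

energies = [0.3, 1.0, 1.4, 2.5]       # hypothetical non-degenerate spectrum
N = len(energies)
ln_N = math.log(N)
mean_E = sum(energies) / N
temps = [0.05 * k for k in range(1, 400)]
rows = [gibbs(energies, T) for T in temps]

tol = 1e-12
E_increasing = all(b[0] >= a[0] - tol for a, b in zip(rows, rows[1:]))
S_increasing = all(b[1] >= a[1] - tol for a, b in zip(rows, rows[1:]))
F_decreasing = all(b[2] <= a[2] + tol for a, b in zip(rows, rows[1:]))

T_hot = 1e6
E_cold, S_cold, F_cold = gibbs(energies, 1e-3)    # stands in for T -> 0
E_hot, S_hot, F_hot = gibbs(energies, T_hot)      # stands in for T -> infinity
F_hot_ratio = F_hot / (-T_hot * ln_N)             # should approach 1

print(E_increasing, S_increasing, F_decreasing)
```

The high-temperature free energy is compared as a ratio because $F \approx -T \ln N$ holds only to leading order in $T$.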
2.2 Inequalities Relating a Thermal State With a Neighboring State

Consider any Hilbert space $\mathcal{H}$, any density matrix $\rho$ acting on $\mathcal{H}$, any Hamiltonian $\mathcal{E}$
acting on $\mathcal{H}$, and any temperature $T$. Define

    $S^\beta(\mathcal{E},\rho) = \beta[\langle\mathcal{E}\rangle_\rho - F^\beta(\mathcal{E})]$    (20)

and

    $F^\beta(\mathcal{E},\rho) = \langle\mathcal{E}\rangle_\rho - T S(\rho)$ .    (21)

I will refer to these functions as the S and F capping functions, respectively, because,
as we will prove later, they are upper bounds to their namesakes.

It's easy to check that $S^\beta(\mathcal{E},\rho^\beta(\mathcal{E})) = S^\beta(\mathcal{E})$ and $F^\beta(\mathcal{E},\rho^\beta(\mathcal{E})) = F^\beta(\mathcal{E})$.

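Claim 4

    $D(\rho // \rho^\beta(\mathcal{E})) = S^\beta(\mathcal{E},\rho) - S(\rho)$    (22a)
    $= \beta[F^\beta(\mathcal{E},\rho) - F^\beta(\mathcal{E})]$ .    (22b)

proof:

    $D(\rho // \rho^\beta(\mathcal{E})) = \langle \ln\rho - \ln\rho^\beta(\mathcal{E}) \rangle_\rho$    (23a)
    $= \beta[-T S(\rho) + \langle\mathcal{E}\rangle_\rho - F^\beta(\mathcal{E})]$    (23b)
    $= S^\beta(\mathcal{E},\rho) - S(\rho)$    (23c)
    $= \beta[F^\beta(\mathcal{E},\rho) - F^\beta(\mathcal{E})]$ .    (23d)

QED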
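Claim 5

    $S(\rho) \leq S^\beta(\mathcal{E},\rho)$ .    (24)

If $\langle\mathcal{E}\rangle_\rho \leq \langle\mathcal{E}\rangle_{\rho^\beta(\mathcal{E})}$, then also

    $S(\rho) \leq S^\beta(\mathcal{E})$ .    (25)

(Eq.(25) agrees with our intuition that $\langle\mathcal{E}\rangle_\rho$ and $S(\rho)$ both measure the energy spread
of $\rho$.)

proof: Eq.(24) follows from Eq.(22a) and the fact that the relative entropy is non-negative.

If $\langle\mathcal{E}\rangle_\rho \leq \langle\mathcal{E}\rangle_{\rho^\beta(\mathcal{E})}$, then

    $S(\rho) \leq \beta[\langle\mathcal{E}\rangle_\rho - F^\beta(\mathcal{E})]$    (26a)
    $\leq \beta[\langle\mathcal{E}\rangle_{\rho^\beta(\mathcal{E})} - F^\beta(\mathcal{E})]$    (26b)
    $= S^\beta(\mathcal{E})$ .    (26c)

QED

Claim 6

    $F^\beta(\mathcal{E}) \leq F^\beta(\mathcal{E},\rho)$ .    (27)

Also

    $F^\beta(\mathcal{E}) \leq \langle\mathcal{E}\rangle_\rho$ .    (28)

Thus, the free energy is never greater than the average energy. (There is no free lunch.)

proof: Eq.(27) follows from Eq.(22b).

Eq.(28) follows from Eq.(24) and the definition of $S^\beta(\mathcal{E},\rho)$.

QED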
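Claims 4 to 6 are easy to test for the special case of a state that is diagonal in the energy eigenbasis, where everything reduces to classical probability vectors. The sketch below makes that simplifying assumption; the spectrum, the temperature and the distribution `p` are arbitrary illustrative numbers.

```python
import math

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def rel_entropy(p, q):
    """Relative entropy D(p // q) in nats (q must be positive where p is)."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0.0)

energies = [0.0, 0.4, 1.0]    # hypothetical spectrum
T = 0.8                       # hypothetical temperature
beta = 1.0 / T

weights = [math.exp(-beta * e) for e in energies]
Z = sum(weights)
q = [w / Z for w in weights]              # thermal distribution
F_thermal = -T * math.log(Z)              # F^beta(E)

p = [0.5, 0.2, 0.3]                       # arbitrary non-thermal distribution
E_p = sum(x * e for x, e in zip(p, energies))
S_p = entropy(p)
S_cap = beta * (E_p - F_thermal)          # S capping function, Eq.(20)
F_cap = E_p - T * S_p                     # F capping function, Eq.(21)
D = rel_entropy(p, q)

gap_22a = abs(D - (S_cap - S_p))                   # Eq.(22a)
gap_22b = abs(D - beta * (F_cap - F_thermal))      # Eq.(22b)
print(D, gap_22a, gap_22b)
```

Because $D \geq 0$, the two equalities immediately give $S(\rho) \leq S^\beta(\mathcal{E},\rho)$ and $F^\beta(\mathcal{E}) \leq F^\beta(\mathcal{E},\rho)$ for this example.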

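Suppose $\mathcal{E}_1$ and $\mathcal{E}_2$ are two Hamiltonians acting on the same Hilbert space. If
$\mathcal{E}_1$ and $\mathcal{E}_2$ act on different subsystems (so that $[\mathcal{E}_1, \mathcal{E}_2] = 0$) and each partition function
is traced over its own subsystem, then clearly $Z^\beta(\mathcal{E}_1 + \mathcal{E}_2) = Z^\beta(\mathcal{E}_1) Z^\beta(\mathcal{E}_2)$ so
$F^\beta(\mathcal{E}_1 + \mathcal{E}_2) = F^\beta(\mathcal{E}_1) + F^\beta(\mathcal{E}_2)$. But what if $\mathcal{E}_1$ and $\mathcal{E}_2$ don't commute? Is the free energy
sub-additive or super-additive (or neither) in its Hamiltonian?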

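Claim 7 (Peierls-Bogoliubov)

    $F^\beta(\mathcal{E}_2) \leq F^\beta(\mathcal{E}_1) + \langle \mathcal{E}_2 - \mathcal{E}_1 \rangle_{\rho^\beta(\mathcal{E}_1)}$ .    (29)

(This inequality is referred to as the Peierls-Bogoliubov inequality in the review by Wehrl [15].
It's used in Feynman's Statistical Mechanics book [16] to do variational approximations of the free
energy. As shown here, it follows trivially from the monotonicity of the relative entropy, which was
found by Uhlmann and others.)

proof:

    $D(\rho^\beta(\mathcal{E}_1) // \rho^\beta(\mathcal{E}_2)) = \langle \ln\rho^\beta(\mathcal{E}_1) - \ln\rho^\beta(\mathcal{E}_2) \rangle_{\rho^\beta(\mathcal{E}_1)}$    (30a)
    $= -S^\beta(\mathcal{E}_1) + \beta\langle\mathcal{E}_2\rangle_{\rho^\beta(\mathcal{E}_1)} - \beta F^\beta(\mathcal{E}_2)$    (30b)
    $= -S^\beta(\mathcal{E}_1) + \beta\langle\mathcal{E}_1\rangle_{\rho^\beta(\mathcal{E}_1)} - \beta F^\beta(\mathcal{E}_2) + \beta\langle\mathcal{E}_2 - \mathcal{E}_1\rangle_{\rho^\beta(\mathcal{E}_1)}$    (30c)
    $= \beta[F^\beta(\mathcal{E}_1) - F^\beta(\mathcal{E}_2)] + \beta\langle\mathcal{E}_2 - \mathcal{E}_1\rangle_{\rho^\beta(\mathcal{E}_1)}$ .    (30d)

QED

Claim 8

    $F^\beta(\mathcal{E}) + \langle\Delta\mathcal{E}\rangle_{\rho^\beta(\mathcal{E}+\Delta\mathcal{E})} \overset{(a)}{\leq} F^\beta(\mathcal{E} + \Delta\mathcal{E}) \overset{(b)}{\leq} F^\beta(\mathcal{E}) + \langle\Delta\mathcal{E}\rangle_{\rho^\beta(\mathcal{E})}$ .    (31)

proof:

Inequality (a) follows if one sets $\mathcal{E}_1 = \mathcal{E} + \Delta\mathcal{E}$ and $\mathcal{E}_2 = \mathcal{E}$ in Eq.(29).
Inequality (b) follows if one sets $\mathcal{E}_1 = \mathcal{E}$ and $\mathcal{E}_2 = \mathcal{E} + \Delta\mathcal{E}$ in Eq.(29).

QED

Claim 9

    $F^\beta(\mathcal{E}) + F^\beta(\Delta\mathcal{E}) \leq F^\beta(\mathcal{E} + \Delta\mathcal{E})$ .    (32)

proof: Just use the no-free-lunch inequality, Eq.(28), in side (a) of Eq.(31).
QED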
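Claims 7 to 9 can be checked for a pair of non-commuting 2x2 Hamiltonians without any matrix library. The sketch below assumes each Hamiltonian is written as $a I + \vec{b}\cdot\vec{\sigma}$ in terms of Pauli matrices, for which $\mathrm{tr}\, e^{-\beta H} = 2 e^{-\beta a}\cosh(\beta|\vec{b}|)$ and the Gibbs state is $\frac{1}{2}(I - \tanh(\beta|\vec{b}|)\,\hat{b}\cdot\vec{\sigma})$. The coefficients, the temperature and the function names are illustrative assumptions.

```python
import math

def norm(b):
    return math.sqrt(sum(x * x for x in b))

def free_energy(H, T):
    """F = -T ln tr exp(-H/T) for H = (a, b), meaning a*I + b.sigma."""
    a, b = H
    return a - T * math.log(2.0 * math.cosh(norm(b) / T))

def expect(H_obs, H_state, T):
    """Expectation of H_obs in the Gibbs state of H_state at temperature T."""
    a_obs, b_obs = H_obs
    _, b_st = H_state
    r = norm(b_st)
    if r == 0.0:
        return a_obs                     # Gibbs state is maximally mixed
    dot = sum(x * y for x, y in zip(b_st, b_obs))
    return a_obs - math.tanh(r / T) * dot / r

def sub(H2, H1):
    return (H2[0] - H1[0], tuple(x - y for x, y in zip(H2[1], H1[1])))

T = 0.7
H1 = (0.3, (1.0, 0.0, 0.0))     # hypothetical: 0.3*I + 1.0*sigma_x
H2 = (-0.2, (0.0, 0.0, 0.8))    # hypothetical: -0.2*I + 0.8*sigma_z (does not commute with H1)
dH = sub(H2, H1)                # Delta E, so that H1 + dH = H2

F1, F2, F_d = free_energy(H1, T), free_energy(H2, T), free_energy(dH, T)

# Eq.(29), which is also side (b) of Eq.(31) with E = H1 and E + Delta E = H2
upper = F1 + expect(dH, H1, T)
# Side (a) of Eq.(31)
lower = F1 + expect(dH, H2, T)
# Eq.(32)
superadditive = F1 + F_d <= F2

print(lower, F2, upper, superadditive)
```

The printed values should satisfy `lower <= F2 <= upper`, and the last flag should be `True`.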

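3 The Conditional Ageing Inequality and Some of its Consequences

In Appendix A we remind the reader of the well known inequality $đW_s \leq -dF_s$,
which says that at fixed temperature, the drop in free energy is an upper bound to
the amount of work system $s$ can do. In this section we apply the conditional ageing
inequality to find a lower bound on $-dF_X$ for a system $X$ in contact with a heat
reservoir at temperature $T$.

We will abbreviate $\rho_{\tau;\theta_\tau,X_\tau}$ by $\rho_\tau$. The partial traces of $\rho_{\tau;\theta_\tau,X_\tau}$ will be
denoted by $\rho_{\tau;\theta_\tau}$ and $\rho_{\tau;X_\tau}$. We will also abbreviate $S_\tau(\cdot) = S_{\rho_\tau}(\cdot)$ for any
argument $(\cdot)$.

Let the joint system of $X$ and $\theta$ have as Hamiltonian

    $\mathcal{E}_{\theta_\tau,X_\tau} = \mathcal{E}_{X_\tau} + \mathcal{E}_{\theta_\tau} + \epsilon_{\theta_\tau,X_\tau} = \mathcal{E}_{X_\tau} + \Delta\mathcal{E}_{\theta_\tau,X_\tau}$ ,    (33)

where $[\mathcal{E}_{X_\tau}, \mathcal{E}_{\theta_\tau}] = 0$ and $\epsilon_{\theta_\tau,X_\tau}$ is small.

The conditional ageing inequality (CAIN) is

    $S_\tau(\theta_\tau | X_\tau)|_{\tau=0}^{\tau} \geq 0$ .    (34)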

Besides the CAIN, we will also assume that the following is true at $\tau = 0$: $\theta_0$ and
$X_0$ are independent and thermal. The independence is achieved by assuming that
the interaction term vanishes at $\tau = 0$, i.e., $\epsilon_{\theta_0,X_0} = 0$.

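Claim 10 If the CAIN holds, and $\theta_0$ and $X_0$ are independent and thermal, then

    $-F^\beta(\mathcal{E}_{X_\tau},\rho_{\tau;X_\tau})|_{\tau=0}^{\tau} - T S^\beta(\Delta\mathcal{E}_{\theta_\tau,X_\tau},\rho_\tau)|_{\tau=0}^{\tau} \leq -F^\beta(\mathcal{E}_{X_\tau})|_{\tau=0}^{\tau}$ ,    (35)

where

    $-F^\beta(\mathcal{E}_{X_\tau}) = T S^\beta(\mathcal{E}_{X_\tau}) - \langle\mathcal{E}_{X_\tau}\rangle_{\rho^\beta(\mathcal{E}_{X_\tau})}$    (36)

and

    $-F^\beta(\mathcal{E}_{X_\tau},\rho_{\tau;X_\tau}) = T S(\rho_{\tau;X_\tau}) - \langle\mathcal{E}_{X_\tau}\rangle_{\rho_{\tau;X_\tau}}$ .    (37)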
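proof:

The CAIN implies

    $S_\tau(X_\tau)|_{\tau=0}^{\tau} \leq S_\tau(\theta_\tau,X_\tau)|_{\tau=0}^{\tau}$ .    (38)

But, by Claims 5 and 9,

    $S_\tau(\theta_\tau,X_\tau) \leq S^\beta(\mathcal{E}_{\theta_\tau,X_\tau},\rho_\tau)$    (39a)
    $= \beta[\langle\mathcal{E}_{\theta_\tau,X_\tau}\rangle_{\rho_\tau} - F^\beta(\mathcal{E}_{\theta_\tau,X_\tau})]$    (39b)
    $\leq \beta[\langle\Delta\mathcal{E}_{\theta_\tau,X_\tau}\rangle_{\rho_\tau} + \langle\mathcal{E}_{X_\tau}\rangle_{\rho_\tau} - F^\beta(\Delta\mathcal{E}_{\theta_\tau,X_\tau}) - F^\beta(\mathcal{E}_{X_\tau})]$    (39c)
    $= S^\beta(\Delta\mathcal{E}_{\theta_\tau,X_\tau},\rho_\tau) + S^\beta(\mathcal{E}_{X_\tau},\rho_\tau)$ .    (39d)

Also, since $X_0$ and $\theta_0$ are independent and thermal,

    $S_0(\theta_0,X_0) = S^\beta(\mathcal{E}_{\theta_0}) + S^\beta(\mathcal{E}_{X_0})$    (40a)
    $= S^\beta(\mathcal{E}_{\theta_0},\rho_0) + S^\beta(\mathcal{E}_{X_0},\rho_0)$ .    (40b)

Combining Eqs.(38), (39d) and (40b) yields

    $S_\tau(X_\tau)|_{\tau=0}^{\tau} \leq S^\beta(\Delta\mathcal{E}_{\theta_\tau,X_\tau},\rho_\tau)|_{\tau=0}^{\tau} + S^\beta(\mathcal{E}_{X_\tau},\rho_\tau)|_{\tau=0}^{\tau}$ .    (41)

Now using

    $S^\beta(\mathcal{E}_{X_\tau},\rho_\tau)|_{\tau=0}^{\tau} = \beta[\langle\mathcal{E}_{X_\tau}\rangle_{\rho_\tau} - F^\beta(\mathcal{E}_{X_\tau})]|_{\tau=0}^{\tau}$    (42)

gives

    $-\beta F^\beta(\mathcal{E}_{X_\tau},\rho_{\tau;X_\tau})|_{\tau=0}^{\tau} = S(\rho_{\tau;X_\tau})|_{\tau=0}^{\tau} - \beta\langle\mathcal{E}_{X_\tau}\rangle_{\rho_{\tau;X_\tau}}|_{\tau=0}^{\tau}$    (43a)
    $\leq S^\beta(\Delta\mathcal{E}_{\theta_\tau,X_\tau},\rho_\tau)|_{\tau=0}^{\tau} - \beta F^\beta(\mathcal{E}_{X_\tau})|_{\tau=0}^{\tau}$ .    (43b)

QED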

4 Conditional Ageing in Terms of Time Reversal 

In this section, we will state the CAIN in terms of time reversal. The Second Law of
Thermodynamics and its generalization, the Jarzynski identity [17], are often stated
using time reversal ideas. This is a natural thing to do since they both describe
entropy changes and such changes arise from irreversible processes. The CAIN can
be viewed as a slight generalization of the Second Law, so it too should be stateable
in terms of time reversal.

For a good pedagogical treatment of time reversal, see, for example, Ref.[18].

In classical physics, given a system of particles labeled by $\mu = 1, 2, \ldots, N$,
if $f(\{\vec{r}_\mu, \vec{p}_\mu\}_{\forall\mu})$ is a function of the positions $\vec{r}_\mu$ and momenta $\vec{p}_\mu$ of the particles,
then the time reversal operator, which we will represent by $\circledast$, keeps the position
vectors the same, but it reverses the velocities, and therefore the momenta. Thus

    $[f(\{\vec{r}_\mu, \vec{p}_\mu\}_{\forall\mu})]^{\circledast} = f(\{\vec{r}_\mu, -\vec{p}_\mu\}_{\forall\mu})$ .

In quantum mechanics, if we express all operators and wavefunctions in position
and spin space, then $[f(\{\vec{r}_\mu, \vec{p}_\mu\}_{\forall\mu})]^{\circledast} = f(\{\vec{r}_\mu, -\vec{p}_\mu\}_{\forall\mu})$ still applies, where now $f$
is either an observable or a wavefunction. The position operators are real and the
momentum operators are pure imaginary. Thus, in the case of spinless particles,
$\circledast$ can be taken to be simply complex conjugation $*$. If the particles do have spin,
then one must also rotate the spin space part of $f(\cdot)$ by a matrix which is real, and
therefore commutes with complex conjugation. See Ref.[18] for more details on how
to deal with spin. In this paper, we will only discuss the spinless case.

This paper is mainly concerned with the time reversal of a simple Markov 
chain. For example, in later sections of the paper, we will model the classical Szilard 
engine by a CB net of the form 




(44) 



The time reversal of this network must look like this: 




The transition matrices for each node of the graph given by Eq.(45) must be expressible
in some way, yet to be specified, in terms of the transition matrices for each node
of the graph given by Eq.(44). Clearly, if we take $a_\tau = (s_\tau, \sigma_\tau)$, then the CB net
given by Eq.(44) is a special case of the Markov chain CB net

    $a_2 \leftarrow a_1 \leftarrow a_0$    (46)

whose time reversal network looks like this:

    $a_2^{\circledast} \rightarrow a_1^{\circledast} \rightarrow a_0^{\circledast}$ .    (47)
To agree with our intuition of how time reversal should operate, we stipulate that
(Appendix B gives a specific example of the time reversal of a Markov chain)

    $P_{a_2^{\circledast}}(a_2) = P_{a_2}(a_2)$    (48a)
    $= \sum_{a_1,a_0} P(a_2|a_1) P(a_1|a_0) P(a_0)$ ,    (48b)

    $P_{a_1^{\circledast}|a_2^{\circledast}}(a_1|a_2) = P_{a_1|a_2}(a_1|a_2)$    (49a)
    $= \frac{\sum_{a_0} P_{a_2|a_1}(a_2|a_1) P_{a_1|a_0}(a_1|a_0) P_{a_0}(a_0)}{\sum_{a_1} \mathrm{num}}$ ,    (49b)

and

    $P_{a_0^{\circledast}|a_1^{\circledast}}(a_0|a_1) = P_{a_0|a_1}(a_0|a_1)$    (50a)
    $= \frac{P_{a_1|a_0}(a_1|a_0) P_{a_0}(a_0)}{\sum_{a_0} \mathrm{num}}$ .    (50b)

For definiteness, we will continue to speak of a Markov chain with only 3 nodes.
Generalization of our statements to the case of Markov chains with an arbitrary
number of nodes is trivial.

Claim 11

    $\frac{P_{a_0}(a_0)}{P_{a_2}(a_2)} = \frac{P_{a_{<3}^{\circledast}|a_2^{\circledast}}(a_{<3}|a_2)}{P_{a_{<3}|a_0}(a_{<3}|a_0)}$ .    (51)

proof:

    $\frac{P_{a_{<3}^{\circledast}|a_2^{\circledast}}(a_{<3}|a_2)}{P_{a_{<3}|a_0}(a_{<3}|a_0)} = \frac{P_{a_0^{\circledast}|a_1^{\circledast}}(a_0|a_1) P_{a_1^{\circledast}|a_2^{\circledast}}(a_1|a_2)}{P_{a_2|a_1}(a_2|a_1) P_{a_1|a_0}(a_1|a_0)}$    (52a)
    $= \frac{P_{a_0|a_1}(a_0|a_1) P_{a_1|a_2}(a_1|a_2)}{P_{a_2|a_1}(a_2|a_1) P_{a_1|a_0}(a_1|a_0)}$    (52b)
    $= \frac{P_{a_0}(a_0)}{P_{a_2}(a_2)}$ .    (52c)

QED
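The Bayes-rule construction of the reversed transition matrices, Eqs.(49) and (50), and the ratio identity of Claim 11 can be checked on a small example. The sketch below assumes a two-state chain with made-up transition probabilities; the names `T10`, `T21`, `R01` and `R12` are hypothetical labels for the forward and reversed matrices.

```python
import itertools

states = range(2)
P0 = [0.6, 0.4]                      # P(a0), hypothetical numbers
T10 = [[0.9, 0.2], [0.1, 0.8]]       # T10[a1][a0] = P(a1|a0)
T21 = [[0.7, 0.4], [0.3, 0.6]]       # T21[a2][a1] = P(a2|a1)

# Forward marginals
P1 = [sum(T10[a1][a0] * P0[a0] for a0 in states) for a1 in states]
P2 = [sum(T21[a2][a1] * P1[a1] for a1 in states) for a2 in states]

# Reversed transition matrices from Bayes' rule
R01 = [[T10[a1][a0] * P0[a0] / P1[a1] for a1 in states] for a0 in states]   # R01[a0][a1] = P(a0|a1)
R12 = [[T21[a2][a1] * P1[a1] / P2[a2] for a2 in states] for a1 in states]   # R12[a1][a2] = P(a1|a2)

# Claim 11: (reversed path probability given a2) / (forward path probability given a0)
# equals P(a0) / P(a2) for every path (a0, a1, a2).
max_gap = 0.0
for a0, a1, a2 in itertools.product(states, repeat=3):
    forward = T21[a2][a1] * T10[a1][a0]
    backward = R01[a0][a1] * R12[a1][a2]
    max_gap = max(max_gap, abs(backward / forward - P0[a0] / P2[a2]))

# Each reversed matrix must be column-normalized
norm_gap = max(abs(sum(R01[a0][a1] for a0 in states) - 1.0) for a1 in states)
norm_gap = max(norm_gap, max(abs(sum(R12[a1][a2] for a1 in states) - 1.0) for a2 in states))

print(max_gap, norm_gap)
```

Both printed numbers should vanish up to rounding. The example uses strictly positive transition probabilities so that no ratio is undefined.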
Now note that if we define S as H{Q^\ 2^^)\1^q, then



Hie,\X,)\:=, (53a) 
l^ ^e,|Xo(9o|Xo) \ ^^^^^ 



0<3,-''^<3 



S) , (53d) 



where S is defined by 

In terms of the operator $\mathcal{S}$, the CAIN can be stated as

    $\langle \mathcal{S} \rangle \geq 0$ .    (55)

In analogy to the Jarzynski equality, Eq.(55) probably generalizes to

    $\langle e^{-\mathcal{S}} \rangle = 1$ .    (56)

Eq.(56) implies Eq.(55) plus much more. In fact, if we expand Eq.(56) in powers of
$\mathcal{S}$, we get Eq.(55) from the first order terms and a fluctuation dissipation theorem
from the second order terms.

This section has considered time reversal of the CAIN only for the classical
case, but it can be generalized in a straightforward way to the quantum case. To
go from the classical to the quantum case, one replaces CB nets by QB nets,
probability distributions by density matrices, and classical information functions
$H(\cdot)$ by quantum information functions $S_\rho(\cdot)$.



5 Szilard's Engine 

The goal of this section is to apply Eq.(35) to Szilard's heat engine.

Eq.(35) gives a lower bound on the drop $-dF_X$ in free energy for a system $X$
that is in contact with a heat reservoir at temperature $T$. The left hand side of
Eq.(35) is a sum of two terms, namely $-F^\beta(\mathcal{E}_{X_\tau},\rho_{\tau;X_\tau})|_{\tau=0}^{\tau}$ and $-T S^\beta(\Delta\mathcal{E}_{\theta_\tau,X_\tau},\rho_\tau)|_{\tau=0}^{\tau}$,
one for $X$ and another "mostly" for $\theta$. If we want to extract as much work as possible
from the system $X$, we want to make the term for $\theta$, which is negative, as
close to zero as possible. So let's assume that the term for $\theta$ can be made zero. This
means that the thermal variables must be "disturbed as little as possible". According
to Eq.(37), the term for $X$ is itself a sum of two terms, namely $T S(\rho_{\tau;X_\tau})|_{\tau=0}^{\tau}$ and
$-\langle\mathcal{E}_{X_\tau}\rangle_{\rho_{\tau;X_\tau}}|_{\tau=0}^{\tau}$. In the case of the Szilard engine, the system is an ideal gas, so its
internal energy is proportional to the temperature. But the temperature is the same
for all $\tau$. Thus, we shall assume that the $-\langle\mathcal{E}_{X_\tau}\rangle_{\rho_{\tau;X_\tau}}|_{\tau=0}^{\tau}$ term is also zero. This
reduces what we need to calculate for the Szilard heat engine to just the $T S(\rho_{\tau;X_\tau})|_{\tau=0}^{\tau}$
term. We will calculate this for certain special forms of the density matrix $\rho_{\tau;X_\tau}$ that
seem good models for the Szilard engine.

The usual Szilard engine is a simple version of Maxwell's demon wherein the 
system inside the box is just one particle. We will also consider a system of two 
particles. During the cycle of the engine, a partition is introduced inside the box, 
creating two compartments, and forcing the particle (or two particles) to choose sides. 
To model this situation, we will use the following random variables:

    $s, \sigma, e, b$    (57)

    $s, t, \sigma, e, b$    (58)

where

    $s$ = system, first particle
    $t$ = tyro (apprentice), second particle, if being considered
    $\sigma$ = sensor (probe, tape, memory), part of devil
    $e$ = thermal part of devil, at temperature $T$
    $b$ = bath at temperature $T$
    $x = s$ if uni-partite system, $x = (s, t)$ if bi-partite system
    $d = (\sigma, e)$ = devil
    $X = (x, \sigma)$ = non-thermal variables (fast changing, not in thermal equilibrium)
    $\theta = (e, b)$ = thermal variables (slow changing, in thermal equilibrium)

We will consider four times $\tau = 0, 1, 2, 3$, where

    $\tau = 0$: initial time
    $\tau = 1$: time when measurement is done, when system and sensor interact
    $\tau = 2$: time when feedback is done. Information encoded in the state of the sensor
    is used to modify the system.
    $\tau = 3$: time when system and sensor are erased and re-initialized.

We will consider 4 cases: C1, Q1, C2, Q2, where C = classical, Q = quantum,
1 = uni-partite system, 2 = bi-partite system.

5.1 C1 Case

Consider the following CB net 




In this net:

For the first row of random variables $s_\tau$: $P(s_0)$ is arbitrary, $P(s_1|s_0) =
\delta(s_1, s_0)$, $P(s_2|\sigma_1)$ is arbitrary, and $P_{s_3}(s_3) = P_{s_0}(s_3)$.

For the second row of random variables $\sigma_\tau$: $P(\sigma_0) = \delta(\sigma_0, 0)$, $P(\sigma_1|s_0)$ is
arbitrary, $P(\sigma_2|\sigma_1) = \delta(\sigma_2, \sigma_1)$, and $P(\sigma_3) = \delta(\sigma_3, 0)$.

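Fig.1 shows the position of the wall of a Szilard engine with this CB net.

[Figure 1 (image not reproduced): wall positions at $\tau = 0$ (start), $\tau = 1$ (measured), $\tau = 2$ (feedback) and $\tau = 3$ (re-initialize).]

Figure 1: This Szilard engine is modeled by the CB net given by Eq.(59).

Define

    $\Delta H_{vol} = H(s_0) - H(s_2|\sigma_1)$ .    (60)

The work done by the system when it changes its volume from $V_0$ at time $\tau = 0$ to
$V_2$ at time $\tau = 2$ is

    $\not\!\Delta W_{vol} = \int_{V_0}^{V_2} dV\, P = \int_{V_0}^{V_2} dV\, \frac{T}{V} = T \ln\frac{V_2}{V_0} = T \left\langle \ln\frac{P(s_2|\sigma_1)}{P(s_0)} \right\rangle = T\,\Delta H_{vol}$ .    (61)

(We assume an ideal gas so $PV = N k_B T$, but $N = 1$ and we are setting $k_B = 1$, so
$PV = T$.)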
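The following table is easy to verify using standard identities in Shannon
Information Theory (especially the chain rule identities).

    step                system $s$                                                          system $s$ + sensor $\sigma$
    $1 \leftarrow 0$    $H(s_\tau)|_{\tau=0}^{1} = 0$                                       $H(s_\tau,\sigma_\tau)|_{\tau=0}^{1} = H(\sigma_1) - H(\sigma_1 : s_0)$
    $2 \leftarrow 1$    $H(s_\tau)|_{\tau=1}^{2} = -\Delta H_{vol} + H(s_2 : \sigma_1)$      $H(s_\tau,\sigma_\tau)|_{\tau=1}^{2} = -\Delta H_{vol} + H(\sigma_1 : s_0)$
    $3 \leftarrow 2$    $H(s_\tau)|_{\tau=2}^{3} = \Delta H_{vol} - H(s_2 : \sigma_1)$       $H(s_\tau,\sigma_\tau)|_{\tau=2}^{3} = \Delta H_{vol} - H(\sigma_1)$
    $0 \leftarrow 3$    $H(s_\tau)|_{\tau=3}^{0} = 0$                                       $H(s_\tau,\sigma_\tau)|_{\tau=3}^{0} = 0$
                                                                                            (62)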

The entropy change over a full cycle is zero, as expected. For some of the $\tau$,
the change in $H(s_\tau, \sigma_\tau)$ contains "Landauer erasure-work terms" $T H(\sigma_1)$,
"Maxwell volume-work terms" $T \Delta H_{vol}$, and even "correlation-energy terms" $T H(\sigma_1 : s_0)$
(these measure a sort of internal energy), but they all manage to cancel each other
out over a full cycle.
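The entries of the C1 table can be reproduced by brute force from the joint distribution $P(s_0)P(\sigma_1|s_0)P(s_2|\sigma_1)$ of the CB net. The following sketch assumes binary variables and made-up probabilities; the helper names `H` and `marginal` are hypothetical. It computes the change of the joint entropy $H(s_\tau, \sigma_\tau)$ for each step and compares it with the combinations of $\Delta H_{vol}$, $H(\sigma_1)$ and $H(\sigma_1 : s_0)$ that follow from the chain rule.

```python
import math

def H(dist):
    """Shannon entropy in nats of a dict {outcome: probability}."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0.0)

def marginal(joint, keep):
    """Marginalize a joint dict onto the tuple positions listed in keep."""
    out = {}
    for key, p in joint.items():
        sub = tuple(key[i] for i in keep)
        out[sub] = out.get(sub, 0.0) + p
    return out

P_s0 = [0.3, 0.7]                        # hypothetical P(s0)
P_sig1 = [[0.9, 0.2], [0.1, 0.8]]        # P_sig1[sigma1][s0], hypothetical measurement
P_s2 = [[0.6, 0.5], [0.4, 0.5]]          # P_s2[s2][sigma1], hypothetical feedback

joint = {(s0, g1, s2): P_s0[s0] * P_sig1[g1][s0] * P_s2[s2][g1]
         for s0 in range(2) for g1 in range(2) for s2 in range(2)}

H_s0 = H(marginal(joint, [0]))
H_sig1 = H(marginal(joint, [1]))
H_s0_sig1 = H(marginal(joint, [0, 1]))
H_sig1_s2 = H(marginal(joint, [1, 2]))

dH_vol = H_s0 - (H_sig1_s2 - H_sig1)     # Eq.(60): H(s0) - H(s2|sigma1)
I_sig1_s0 = H_s0 + H_sig1 - H_s0_sig1    # mutual information H(sigma1 : s0)

# Changes of H(s_tau, sigma_tau), using sigma_0 = 0, s_1 = s_0, sigma_2 = sigma_1,
# s_3 distributed like s_0 and sigma_3 = 0.
step_10 = H_s0_sig1 - H_s0
step_21 = H_sig1_s2 - H_s0_sig1
step_32 = H_s0 - H_sig1_s2
step_03 = 0.0
cycle_total = step_10 + step_21 + step_32 + step_03

print(step_10, step_21, step_32, cycle_total)
```

The three non-trivial steps should equal $H(\sigma_1) - H(\sigma_1 : s_0)$, $-\Delta H_{vol} + H(\sigma_1 : s_0)$ and $\Delta H_{vol} - H(\sigma_1)$ respectively, and the total over the cycle should vanish.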

5.2 Q1 Case

In this case, we will abbreviate $\rho_\tau = \rho_{\tau;s_\tau,\sigma_\tau}$ and $S_\tau(\cdot) = S_{\rho_\tau}(\cdot)$. Also, in this case,
$X = (s, \sigma)$.

We begin by specifying the form of $\rho_\tau$ that we will assume for $\tau = 0, 1, 2, 3$.
We will assume that the sensor random variable $\sigma_\tau$ is a classical random
variable for all $\tau$. Hence, $\sigma_\tau = (\sigma_\tau)_{cl}$ for all $\tau$.

• At time r — 0, 

Po = Y.Y.[^so \Xo)A{Xo, ro) ] [ h.c. ] , (63) 

ro cro 

where 

A{Xo,ro) = A{so,ro)S{ao,Q) , (64) 

and 

El^(^o,ro)P = l. (65) 
Pq can be represented as a QB net as follows: 
Po



tr, 



■(3) 



[ h.c. ] 



At time r — 1, 



ro Cl : 



\X,)A{X,\Xo) 
■ A{Xo, ro) 



[ h.c. ] , 



where 74(Xi|Xo) is an isometry. 

pi can be represented as a QB net as follows: 



Pi = tr^Q 



(j<)^(ro) 




• At time t — 2, 



P2 



EE 



ro 0-2:0 



\X2)A{X2\X,) 
A{Xo,ro) 



where 



(66) 



(67) 



[ h.c. ] . (68) 



[ h.c. ] , (69) 



A{X2\X,) = A{s2\s,,a,)Sia2,a,) 



and, for all Ui, 



h.c. 

si s[ 



^[^(S2|si,(7i) ] 

P2 can be represented as a QB net as follows: 



cl^i2 





(70) 



(71) 



[ h.c. ] . (72) 



• At time r = 3, 
P3



EEE 

R3,r3 ro 0-3:0 



E 



S3 : 



\Xs)A{Xs,R3,r,\X2) 
A{X2\Xr) 
A{X,\Xo) 
AiXo,ro) 



[ h.c. ] 



(73) 



where 



A(X3, R3, r,\X2) = A,^^r,{s3, r,)6{^s, 0)A{R,\s2, a^) (74) 

and A{R3\s2,(J2) is an isometry. Performing the sum over R^, Eq.f l73|l reduces 
to 



P3 = EE[^^3 |X3M^„,,,(X3,r3)][h.C. ] 

rs 0-3 



(75) 



P3 can be represented as a QB net as follows 

cU„ 



P3 



tr. 



^3-0:3) 



[ h.c. ] 



For r = 1,2, defin(fl 



A^2 = 5o(.o)-5.(^.ki) 



and 



(76) 



(77) 



(7J 



The following table is easy to verify using standard identities in Shannon
Information Theory (especially the chain rule identities).





system s 


system s + sensor a 


1 ^ 


= Si{s^) - So{sq) 


Sr{s^, a^)\l^o = 


2 ^ 1 


= + S2{S2 : £1) + So{s,) + S,{s,) 


= -A5S + ASi'J 


3^2 




= ASi'J,-H{a,) 


0^3 


= 


= 





79) 



In their papers (Refs.[5] to [11]), Sagawa and Ueda introduce a quantity that they denote by
Iqc and call the quantum-classical information. Their Iqc equals our AS"^^] 
5.3 C2 Case

In this case, $x = (s, t)$ and $X = (x, \sigma)$.
Consider the following CB net





(80) 







In this net: 

Ps„,toiso,to) is arbitrary. Ps,„t_,{s3,t3) = i'.,„t„(s3, ts)- 

For the first row of random variables s^: P(so|so'^o) ~ ^(-Soj-So)' -Pl-Sil-So) = 
5(si,So), and P{s2\(Ji) is arbitrary. 

For the second row of random variables t^: P(to|'So,to) = 5(to,^o), -P(^il^o) = 
5{ti,to). and P{t2\ai) is arbitrary. 

For the third row of random variables a^: -P(co) = <^(co,0), P{ai\so) is arbi- 
trary, P(o-2|o-i) = d{a2,ai), and ^(0-3) = 5((T3,0). 

Define 



for /X = s,t,x. 

The following table is easy to verify using standard identities in Shannon
Information Theory (especially the chain rule identities).




(81a) 
(81b) 
(81c) 



Let 



(82) 

bi-system x = [s^t) 


1 * J- / J. \ 1 

bi-system x = [s, t) + sensor a 


1^0 


TT / \ n 
^(^r)lr=0 = 






= 




2 1 




TT/ \\'2 




C A TT 1 ZU"/ . \ 

_ 1 -AH„oi,x + H{x2 ■■ g:^) 


C A TT 1 7" T ( . ^ \ 




[ +H{Sq : to) 


i +^(so : ^o) 


3^2 


















0^3 




Hix,,aXr=s- 




= 


= 



Note that some of the entropy changes contain a new kind of term $T H(s_0 : t_0)$,
a "correlation-energy term" that measures a type of internal energy of the
bi-partite system.



5.4 Q2 Case 

In this case, we will abbreviate $\rho_\tau = \rho_{\tau;s_\tau,t_\tau,\sigma_\tau}$ and $S_\tau(\cdot) = S_{\rho_\tau}(\cdot)$. Also, in this case,
$x = (s, t)$, $X = (x, \sigma)$.

We begin by specifying the form of $\rho_\tau$ that we will assume for $\tau = 0, 1, 2, 3$.
The form of $\rho_\tau$ is the same as that given for the Q1 case, except that instead of
$X = (s, \sigma)$ we have $X = (s, t, \sigma)$.

For r — 1,2, define 

A^S,^ = Sr{s,)-Sr{s,\a,), (84a) 
A^S,, = SAto)-SAUa,), (84b) 
A^2,, = A^(;)^+A5(J, . (84c) 

Let 

XW^S.^^^^S. (85) 

for — s,t,x. 

The following table is easy to verify using standard identities in Shannon
Information Theory (especially the chain rule identities).
bi-system x = (s, t)


bi-system x = [s, t) + sensor a 


1 ^ 


= Si{x^) - So{xo) 


Sri 


Xr, O:r)\l=0 = 

-AS''^] +Hia,) 




Sri 

-\ 


^ -ASil^+S,{x,: a,) 

[ +Soiso: U) + So{xo)-Si{x,) 


Sri 

-\ 


Xr, ^r)\l=l = 

~AS^^i^^ + AS^^i^^ 
I +SoisQ : to) 


3^2 


Sri 

-\ 


Xr)\U = 

f ASll^-S,{x,: a,) 
. -So{so : to) 


Sri 

-\ 


X^, £j|?=2 = 

I -Soi-lo ■ to) 


0^3 


= 


Srix^, a J|^^3 = 
= 



(86) 



Note that just as in the C2 case, here too some entropy changes contain
"correlation-energy terms" $T S_0(s_0 : t_0)$ that measure a type of internal energy of the bi-partite
system.

A Appendix: Very Brief Review of Pertinent Classical Thermodynamics

People with diverse backgrounds might find the results of this paper useful. Some
of them might be rusty or uncomfortable in their knowledge of classical thermodynamics.
To help those people out, here is a brief review of some facts about classical
thermodynamics that are pertinent to this paper.

As usual, $Q$ = heat, $E = U$ = internal energy, $W$ = work, $P$ = pressure, $V$ =
volume, $S$ = entropy, $T$ = temperature, $F$ = free energy.

Let $X$ be any physical quantity pertaining to a system. If $X$ is an actual
function of the thermodynamical state of the system (i.e., a "state function"), we
will use $dX$ to denote a differential, infinitesimal contribution to $X$. If $X$ is not a state
function, we will use $đX$ to denote a non-differential, infinitesimal contribution to $X$.

We will also use finite analogues of $dX$ and $đX$. If $X$ is a state function, let
$\Delta X$ denote a finite difference, a finite change in $X$. If $X$ is not a state function, let
$\not\!\Delta X$ denote a finite contribution to $X$.

We will also use a subscript of $\circ$ (for instance, as in $\Delta_\circ X$) to indicate that a
change or contribution occurs over a full cycle of a cyclic process.

The First Law of thermodynamics for a system $s$ is

    $đQ_s = dE_s + đW_s$ .    (87)

I like to represent it by a 3-port "circuit diagram"

[circuit diagram not reproduced: a 3-port element with ports $đQ_s$, $dE_s$ and $đW_s$]    (88)



When considering more than one system, one can draw a 3-port circuit like Eq.(88)
for each system. Given several systems, any pair of them, say $s_1$ and $s_2$, might be
in thermal contact, or in mechanical contact. Thermal contact (a wall that allows
heat to flow across it from $s_1$ to $s_2$ or vice versa) can be indicated by drawing a line
connecting the two a ports of the 3-port diagrams of $s_1$ and $s_2$. Mechanical contact
(a wall between $s_1$ and $s_2$ that is impermeable but free to move, thus making the
volume of one system larger and the other smaller) can be indicated by drawing a
line connecting the two c ports of the 3-port diagrams of $s_1$ and $s_2$. In a sequence
of steps called a "process", the thermal and mechanical contacts can change as a
function of time.

Here are some simple processes often considered in thermodynamics. 



(a) System s and bath b
First Law:

[circuit diagram not reproduced: 3-port elements for the system $s$ and the bath $b$]    (89)

Second Law:

    $dS_s + dS_b \geq 0$    (90)

Extra Constraints:

    $đQ_b = T\,dS_b$ (definition of heat bath)    (91a)
    $đQ_b + đQ_s = 0$ (thermal contact)    (91b)

[circuit diagram not reproduced: a 3-port element with ports $T\,dS_s$, $dE_s$ and $-dF_s = P\,dV$]    (91c)

(This is a circuit diagram of Eqs.(12c) and (12d) at constant temperature.)



Claim 12

    $dS_s \geq \frac{đQ_s}{T}$ .    (92)

(Thus, the entropy of the system increases by as much or more than the heat/temperature
absorbed by the system) and

    $đW_s \leq -dF_s$ .    (93)

(Thus, the drop in free energy of the system is an upper bound to the amount
of work the system can do.)

For a cycle, $\Delta_\circ S_s = 0$ so $\not\!\Delta_\circ Q_s \leq 0$. (No perpetuum mobile of the second kind.)

proof:

    $0 \leq dS_s + dS_b = dS_s + \frac{đQ_b}{T} = dS_s - \frac{đQ_s}{T}$ .    (94)

Eq.(93) follows from the following facts:

    $-dF_s = T\,dS_s - dE_s$
    $đW_s = đQ_s - dE_s$
    $T\,dS_s \geq đQ_s$    (95)

QED

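(b) Hot bath h and Cold Bath c
First Law:

[circuit diagram not reproduced: 3-port elements for the baths $h$ and $c$, with ports $đQ_h$, $dE_h$, $đW_h$ and $đQ_c$, $dE_c$, $đW_c$]    (96)

Second Law:

    $dS_h + dS_c \geq 0$    (97)

Extra Constraints:

    $đQ_h = T_h\,dS_h$ (definition of heat bath)    (98a)
    $đQ_c = T_c\,dS_c$ (definition of heat bath)    (98b)
    $đQ_h + đQ_c = 0$ (thermal contact)    (98c)
    $T_h > T_c$    (98d)

Claim 13 $đQ_c = -đQ_h \geq 0$. (Thus, heat flows from the hot bath to the cold one.)

proof:

    $0 \leq \frac{đQ_h}{T_h} + \frac{đQ_c}{T_c} = đQ_c \left( \frac{1}{T_c} - \frac{1}{T_h} \right)$ .

QED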

(c) Heat Engine 
First Law: 



dQh dWh 



dEh 



dQs dWs 



dE.^ 



dQc dWc 



dE, 



ThTc 



(99) 



(100) 



Second Law: 



dSh + dSg + dSc — 
(Equality because we assume a quasi-static process)



(101) 
Extra Constraints:



Claim 14 



proof: 



    $đQ_h = T_h\,dS_h$ (definition of heat bath)    (102a)
    $đQ_c = T_c\,dS_c$ (definition of heat bath)    (102b)
    $đQ_h + đQ_s + đQ_c = 0$ (thermal contact)    (102c)
    $T_h > T_c$    (102d)
    $\Delta_\circ S_s = \Delta_\circ E_s = 0$ (one cycle)    (102e)



V T, 



^ ^ = — 7^ = efficiency (104) 

AoQc J-c 



AoS.^-AoS.^-f^l . (105) 



Ti W— II I rp 

r \ 1) 



^oWs =AiQs - AoEs =^oQs = -^oQh -XoQc = (^^^y^^XoQc (i06) 

QED 

As shown in Fig.2, the cycle of a Carnot engine consists of a rectangle in the
$T$, $S$ plane. The system $s$ must first be brought (via an isentropic, adiabatic step) to
the temperature of the hot bath $b_h$ (or the cold bath $b_c$), before it is put in contact
with that bath, or else there would be a temperature difference between the system
and that bath which would make the process not quasi-static.



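[Figure 2 (image not reproduced): a rectangular cycle with steps 1 to 4, with thermal contact indicated along the time axis.]

Figure 2: Cycle of a Carnot Engine. Steps 1 (baking s) and 3 (cooling s) are isentropic,
whereas steps 2 and 4 are isothermal.

B Appendix: Time Reversal of C1 Case

A simple exercise in time reversal is to find the time reversal of the CB net given by
Eq.(59), what we called the C1 case of Szilard's engine. Recall $X = (s, \sigma)$. In our
model,

    $P_{X_0}(X_0) = \delta(\sigma_0, 0)\, P_{s_0}(s_0)$ ,    (107a)

    $P_{X_1|X_0}(X_1|X_0) = \delta(s_1, s_0)\, P_{\sigma_1|s_0}(\sigma_1|s_0)$ ,    (107b)

and

    $P_{X_2|X_1}(X_2|X_1) = \delta(\sigma_2, \sigma_1)\, P_{s_2|\sigma_1}(s_2|\sigma_1)$ .    (107c)

Using Eqs.(107), it is easy to show that

    $P_{X_2^{\circledast}}(X_2) = P_{X_2}(X_2) = P_{s_2|\sigma_1}(s_2|\sigma_2) \sum_{s_0} P_{\sigma_1|s_0}(\sigma_2|s_0) P_{s_0}(s_0)$ ,    (108a)

    $P_{X_1^{\circledast}|X_2^{\circledast}}(X_1|X_2) = P_{X_1|X_2}(X_1|X_2) = \delta(\sigma_1, \sigma_2)\, \frac{P_{\sigma_1|s_0}(\sigma_2|s_1) P_{s_0}(s_1)}{\sum_{s_1} \mathrm{num}}$ ,    (108b)

and

    $P_{X_0^{\circledast}|X_1^{\circledast}}(X_0|X_1) = P_{X_0|X_1}(X_0|X_1) = \delta(s_0, s_1)\, \delta(\sigma_0, 0)$ .    (108c)

Thus the time reversed process has the following CB net

[CB net diagram not reproduced]    (109)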
C Appendix: Binary Symmetric Channels

In this appendix, we will discuss some of the properties of binary symmetric channels.
Many results in classical Shannon Information Theory simplify considerably when
they are specialized to binary symmetric channels. For instance, the channel capacity
is trivial to calculate for such channels [19].

Throughout this appendix, we will assume $\alpha, \beta, \gamma, \epsilon \in [0, 1]$.

Define the complement of $\alpha$ by

    $\bar\alpha = 1 - \alpha$ ,    (110)

and the symmetric product of $\alpha$ and $\beta$ by

    $\alpha * \beta = \alpha\bar\beta + \bar\alpha\beta$ .    (111)

As shown in Fig.3, the symmetric product has a simple geometrical interpretation
in terms of areas contained in the unit square.

[Figure 3 (image not reproduced)]

Figure 3: The symmetric product $\alpha * \beta$ equals the shaded area within the unit square.

One can easily check that the symmetric product is commutative and associative:

    $\alpha * \beta = \beta * \alpha$ ,    (112a)
    $\alpha * (\beta * \gamma) = (\alpha * \beta) * \gamma$ .    (112b)

Other useful properties of the symmetric product are

    $\overline{\alpha * \beta} = 1 - \alpha * \beta = \bar\alpha * \beta$    (113)

and

    $\alpha * 0 = \alpha$ ,    (114a)
    $\alpha * 1 = \bar\alpha$ ,    (114b)
    $\alpha * \frac{1}{2} = \frac{1}{2}$ .    (114c)

Define a symmetric matrix by

    $M(\alpha) = \begin{pmatrix} \bar\alpha & \alpha \\ \alpha & \bar\alpha \end{pmatrix}$    (115)

and a symmetric vector by

    $v(\epsilon) = \begin{pmatrix} \bar\epsilon \\ \epsilon \end{pmatrix}$ .    (116)

One can easily check that

    $M(\alpha) v(\epsilon) = v(\alpha * \epsilon)$ ,    (117)

and

    $M(\beta) M(\alpha) = M(\beta * \alpha)$ .    (118)

Define the binary entropy function $h(\alpha)$ by

    $h(\alpha) = -\alpha \ln\alpha - \bar\alpha \ln\bar\alpha$ .    (119)

A binary symmetric channel is defined as the classical Bayesian net $y \leftarrow x$,
where the transition matrix $P_{y|x}$ is of the form

    $P(y|x)$    $x = 0$          $x = 1$
    $y = 0$     $\bar\alpha$     $\alpha$
    $y = 1$     $\alpha$         $\bar\alpha$          (120)

This transition matrix is often represented by the diagram

[diagram not reproduced: the transitions $0 \to 0$ and $1 \to 1$ each carry probability $1 - \alpha$, and the crossed transitions each carry probability $\alpha$]    (121)

Note that the binary symmetric channel with $P_{y|x} = M(\alpha)$ is doubly stochastic (the
rows and columns of $M(\alpha)$ sum to one). It also satisfies

    $H(y|x) = (P_x(0) + P_x(1)) \left( \alpha \ln\frac{1}{\alpha} + \bar\alpha \ln\frac{1}{\bar\alpha} \right)$    (122a)
    $= h(\alpha)$ .    (122b)

Claim 15

    $h(\alpha * \epsilon) \geq h(\epsilon)$ .    (123)

proof: As explained in Ref.[14], if $T_{y|x}$ is a doubly stochastic transition matrix, then
the monotonicity of the relative entropy implies that

    $H(T_{y|x} P_x) \geq H(P_x)$ .    (124)

Now set $T_{y|x} = M(\alpha)$ and $P_x = v(\epsilon)$.
QED
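The algebraic properties of the symmetric product and Claim 15 can be spot-checked on a grid of probabilities. This is a minimal sketch; the function names `bar`, `sym`, `h`, `M` and `v` are illustrative, and the ordering of the components of `v` is an assumption (either ordering satisfies the identities tested here).

```python
import math

def bar(a):
    """Complement, Eq.(110)."""
    return 1.0 - a

def sym(a, b):
    """Symmetric product a*b = a(1-b) + (1-a)b, Eq.(111)."""
    return a * bar(b) + bar(a) * b

def h(a):
    """Binary entropy in nats, with 0 ln 0 taken as 0, Eq.(119)."""
    return -sum(x * math.log(x) for x in (a, bar(a)) if x > 0.0)

def M(a):
    return [[bar(a), a], [a, bar(a)]]

def v(e):
    return [bar(e), e]

def matvec(A, x):
    return [sum(A[i][k] * x[k] for k in range(2)) for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

grid = [k / 10.0 for k in range(11)]
worst = 0.0            # largest violation of the algebraic identities
claim15_ok = True
for a in grid:
    for e in grid:
        worst = max(worst, abs(sym(a, e) - sym(e, a)))                   # commutativity
        lhs, rhs = matvec(M(a), v(e)), v(sym(a, e))                      # Eq.(117)
        worst = max(worst, max(abs(p - q) for p, q in zip(lhs, rhs)))
        claim15_ok = claim15_ok and h(sym(a, e)) >= h(e) - 1e-12         # Eq.(123)
        for b in grid:
            worst = max(worst, abs(sym(a, sym(b, e)) - sym(sym(a, b), e)))   # associativity
            P, Q = matmul(M(b), M(a)), M(sym(b, a))                      # Eq.(118)
            worst = max(worst, max(abs(P[i][j] - Q[i][j])
                                   for i in range(2) for j in range(2)))

print(worst, claim15_ok)
```

A grid check is of course not a proof, but it catches sign or ordering slips in the definitions of $M(\alpha)$ and $\alpha * \beta$.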

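Consider the model for the C1 case of the Szilard engine which was described
in Section 5.1. Let us specialize that model by further assuming that $P_{s_0} = v(\epsilon)$,
$P_{\sigma_1|s_0} = M(\alpha)$ and $P_{s_2|\sigma_1} = M(\beta)$. Then the table given by Eq.(62) can be expressed
in terms of the probabilities $\epsilon$, $\alpha$ and $\beta$ as follows:

    step                system $s$                                                         system $s$ + sensor $\sigma$
    $1 \leftarrow 0$    $H(s_\tau)|_{\tau=0}^{1} = 0$                                      $H(s_\tau,\sigma_\tau)|_{\tau=0}^{1} = h(\alpha)$
    $2 \leftarrow 1$    $H(s_\tau)|_{\tau=1}^{2} = h(\beta*\alpha*\epsilon) - h(\epsilon)$   $H(s_\tau,\sigma_\tau)|_{\tau=1}^{2} = h(\beta) + h(\alpha*\epsilon) - h(\alpha) - h(\epsilon)$
    $3 \leftarrow 2$    $H(s_\tau)|_{\tau=2}^{3} = -h(\beta*\alpha*\epsilon) + h(\epsilon)$  $H(s_\tau,\sigma_\tau)|_{\tau=2}^{3} = -h(\beta) - h(\alpha*\epsilon) + h(\epsilon)$
    $0 \leftarrow 3$    $H(s_\tau)|_{\tau=3}^{0} = 0$                                      $H(s_\tau,\sigma_\tau)|_{\tau=3}^{0} = 0$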
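The closed-form entries of this last table can be compared with a brute-force evaluation of the entropies from the joint distribution of $(s_0, \sigma_1, s_2)$. The sketch below assumes particular values of $\epsilon$, $\alpha$ and $\beta$ chosen only for illustration, and the helper names are hypothetical.

```python
import math

def h(a):
    """Binary entropy in nats, with 0 ln 0 taken as 0."""
    return -sum(x * math.log(x) for x in (a, 1.0 - a) if x > 0.0)

def sym(a, b):
    """Symmetric product a*b = a(1-b) + (1-a)b."""
    return a * (1.0 - b) + (1.0 - a) * b

def bsc(a):
    """Binary symmetric channel matrix, indexed [output][input]."""
    return [[1.0 - a, a], [a, 1.0 - a]]

eps, alpha, beta = 0.2, 0.1, 0.3           # hypothetical parameter values
P_s0 = [1.0 - eps, eps]
A, B = bsc(alpha), bsc(beta)
joint = {(s0, g1, s2): P_s0[s0] * A[g1][s0] * B[s2][g1]
         for s0 in range(2) for g1 in range(2) for s2 in range(2)}

def H_of(indices):
    """Entropy in nats of the marginal on the given positions of (s0, sigma1, s2)."""
    marg = {}
    for key, p in joint.items():
        sub = tuple(key[i] for i in indices)
        marg[sub] = marg.get(sub, 0.0) + p
    return -sum(p * math.log(p) for p in marg.values() if p > 0.0)

# Column "system s + sensor sigma": brute force versus closed form
joint_brute = [H_of([0, 1]) - H_of([0]), H_of([1, 2]) - H_of([0, 1]), H_of([0]) - H_of([1, 2]), 0.0]
joint_closed = [h(alpha),
                h(beta) + h(sym(alpha, eps)) - h(alpha) - h(eps),
                -h(beta) - h(sym(alpha, eps)) + h(eps),
                0.0]

# Column "system s": brute force versus closed form
sys_brute = [0.0, H_of([2]) - H_of([0]), H_of([0]) - H_of([2]), 0.0]
h3 = h(sym(beta, sym(alpha, eps)))
sys_closed = [0.0, h3 - h(eps), -h3 + h(eps), 0.0]

gap = max(abs(x - y) for x, y in zip(joint_brute + sys_brute, joint_closed + sys_closed))
cycle_joint = sum(joint_closed)
cycle_sys = sum(sys_closed)
print(gap, cycle_joint, cycle_sys)
```

The printed gap and both cycle totals should vanish up to rounding, in agreement with the statement that the entropy change over a full cycle is zero.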